System Profiling and Green Capabilities for Large Scale and Distributed Infrastructures. (Profilage système et leviers verts pour les infrastructures distribuées à grande échelle)
Abstract
Reducing the energy consumption of large scale and distributed infrastructures has become a real challenge for both industry and academia, as the many efforts devoted to it attest. Initiatives for reducing the energy consumption of these infrastructures can, without loss of generality, be divided into hardware and software initiatives. Unlike their hardware counterparts, software solutions to the energy reduction problem rarely result in real deployments. On the one hand, this can be explained by the fact that they are application oriented. On the other hand, their failure can be attributed to their complexity: they often require extensive technical knowledge of the proposed solution and/or a thorough understanding of the applications at hand. This restricts their use to a limited number of experts, because users usually lack the necessary skills. In addition, although subsystems such as memory are becoming increasingly power hungry, current software energy reduction techniques fail to take them into account. This thesis proposes a methodology for reducing the energy consumption of large scale and distributed infrastructures. The methodology comprises three steps: (i) phase identification, (ii) phase characterization, and (iii) phase identification and system reconfiguration. It abstracts away from individual applications and focuses on the infrastructure, whose runtime behaviour it analyses in order to take reconfiguration decisions accordingly. The proposed methodology is implemented and evaluated on high performance computing (HPC) clusters of varied sizes through a Multi-Resource Energy Efficient Framework (MREEF). MREEF implements the proposed energy reduction methodology in a way that leaves users free to implement their own system reconfiguration decisions depending on their needs.
Experimental results show that our methodology reduces the energy consumption of the overall infrastructure by up to 24% with less than 7% performance degradation. By taking all subsystems into account, our experiments demonstrate that the energy reduction problem in large scale and distributed infrastructures can benefit from more than “the traditional” processor frequency scaling. Experiments on clusters of varied sizes demonstrate that MREEF, and therefore our methodology, can easily be extended to a large number of energy aware clusters. The extension of MREEF to virtualized environments such as clouds shows that the proposed methodology goes beyond HPC systems and can be used in many other computing environments.
Similar resources
Bringing Introspection into BlobSeer: Towards a Self-Adaptative Distributed Data Management System
Introspection is the prerequisite of an autonomic behavior, the first step towards a performance improvement and a resource-usage optimization for large-scale distributed systems. In Grid environments, the task of observing the application behavior is assigned to monitoring systems. However, most of them are designed to provide general resource information and do not consider specific informati...
Modelling river discharge for large drainage basins: from lumped to distributed approach
The paper presents an upscaled application of the HBV model to the German part of the Elbe drainage basin, and an intercomparison of lumped and distributed versions of the model. The objectives of the work were (a) to check the model performance for large-scale basins, and (b) to compare the lumped and distributed versions of the model. Three versions of the HBV model, one lumped and two distribut...
Clouds: a New Playground for the XtreemOS Grid Operating System
The emerging cloud computing model has recently gained a lot of interest both from commercial companies and from the research community. XtreemOS is a distributed operating system for large-scale wide-area dynamic infrastructures spanning multiple administrative domains. XtreemOS, which is based on the Linux operating system, has been designed as a Grid operating system providing native support...
One step further in large-scale evaluations: the V-DS environment
Validating current and next-generation distributed systems targeting large-scale infrastructures is a complex task. Several methodologies are possible. However, experimental evaluations on real testbeds are unavoidable in the life-cycle of a distributed middleware prototype. In particular, performing such real experiments in a rigorous way requires benchmarking developed prototypes at larger ...